SPORTAL: Profiling the Content of Public SPARQL Endpoints

Authors

  • Ali Hasnain
  • Qaiser Mehmood
  • Syeda Sana e Zainab
  • Aidan Hogan
Abstract

Access to hundreds of knowledge-bases has been made available on the Web through public SPARQL endpoints. Unfortunately, few endpoints publish descriptions of their content (e.g., using VoID). It is thus unclear how agents can learn about the content of a given SPARQL endpoint or, relatedly, find SPARQL endpoints with content relevant to their needs. In this paper, we investigate the feasibility of a system that gathers information about public SPARQL endpoints by querying them directly about their own content. With the advent of SPARQL 1.1 and features such as aggregates, it is now possible to specify queries whose results would form a detailed profile of the content of the endpoint, comparable with a large subset of VoID. In theory it would thus be feasible to build a rich centralised catalogue describing the content indexed by individual endpoints by issuing them SPARQL (1.1) queries; this catalogue could then be searched and queried by agents looking for endpoints with content they are interested in. In practice, however, the coverage of the catalogue is bounded by the limitations of public endpoints themselves: some may not support SPARQL 1.1, some may return partial responses, some may throw exceptions for expensive aggregate queries, etc. Our goal in this paper is thus twofold: (i) using VoID as a bar, to empirically investigate the extent to which public endpoints can describe their own content, and (ii) to build and analyse the capabilities of a best-effort online catalogue of current endpoints based on the (partial) results collected.
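
As a rough illustration of the idea in the abstract, the sketch below pairs a few VoID properties with SPARQL 1.1 aggregate queries and assembles a best-effort profile from whatever answers come back. It is not the authors' implementation: the query set, the names `PROFILE_QUERIES`, `parse_count` and `build_profile`, and the caller-supplied `run_query` function (assumed to send one query to an endpoint and return the SPARQL JSON response as text) are all assumptions made for this example.

```python
import json

# Hypothetical profiling queries, keyed by the VoID property each one approximates.
# All of them rely on SPARQL 1.1 aggregates (COUNT, DISTINCT).
PROFILE_QUERIES = {
    "void:triples": "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
    "void:classes": "SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }",
    "void:properties": "SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }",
    "void:distinctSubjects": "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s ?p ?o }",
    "void:distinctObjects": "SELECT (COUNT(DISTINCT ?o) AS ?n) WHERE { ?s ?p ?o }",
}


def parse_count(response_text):
    """Extract the integer bound to ?n from a SPARQL JSON result.

    Returns None when the response is not usable (not JSON, empty, malformed).
    """
    try:
        doc = json.loads(response_text)
        return int(doc["results"]["bindings"][0]["n"]["value"])
    except (ValueError, KeyError, IndexError, TypeError):
        return None


def build_profile(run_query):
    """Run every profiling query and keep only the ones that succeeded.

    run_query is an assumed callable: it takes a query string and returns the
    endpoint's response body as text, or raises on timeouts and errors.
    """
    profile = {}
    for void_property, query in PROFILE_QUERIES.items():
        try:
            count = parse_count(run_query(query))
        except Exception:
            # Endpoint rejected the query, timed out, or lacks SPARQL 1.1 support.
            count = None
        if count is not None:
            profile[void_property] = count
    return profile
```

Skipping failed queries rather than aborting mirrors the best-effort framing of the abstract: an endpoint that rejects an expensive aggregate still contributes whatever partial description it can return.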

Similar resources

SPORTAL: Searching for Public SPARQL Endpoints

There are hundreds of SPARQL endpoints on the Web, but finding an endpoint relevant to a client’s needs is difficult: each endpoint acts like a black box, often without a description of its content. Herein we briefly describe Sportal: a system that collects meta-data about the content of endpoints and compiles it into a central catalogue over which clients can search. Sportal sends queries to...

SPARQL Web-Querying Infrastructure: Ready for Action?

Hundreds of public SPARQL endpoints have been deployed on the Web, forming a novel decentralised infrastructure for querying billions of structured facts from a variety of sources on a plethora of topics. But is this infrastructure mature enough to support applications? For 427 public SPARQL endpoints registered on the DataHub, we conduct various experiments to test their maturity. Regarding di...

YummyData: Measuring the "Sparkliness" of Biomedical Sparql Endpoints

Although increasing amounts of biomedical data are being provided as structured content on the Semantic Web, there is currently no standardized way to monitor SPARQL endpoints for their availability, reliability or content flux. Importantly, there are additional issues relating to the provision of version-sensitive data republished by third parties or made available as part of a one-off research...

SPARQLES: Monitoring public SPARQL endpoints

We describe SPARQLES: an online system that monitors the health of public SPARQL endpoints on the Web by probing them with custom-designed queries at regular intervals. We present the architecture of SPARQLES and the variety of analytics that it runs over public SPARQL endpoints, categorised by availability, discoverability, performance and interoperability. To motivate the system, we give exa...

Improving Discovery in Life Sciences Linked Open Data Cloud

Multiple datasets that add high value to biomedical research have been exposed on the web as part of the Life Sciences Linked Open Data (LSLOD) Cloud. The ability to easily navigate through these datasets is crucial for personalized medicine and the improvement of the drug discovery process. However, navigating these multiple datasets is not trivial as most of these are only available as isolated S...

Journal:
  • Int. J. Semantic Web Inf. Syst.

Volume 12

Publication date: 2016